77 research outputs found

    SHARP: Harmonizing Galaxy and Taverna workflow provenance

    SHARP is a Linked Data approach for harmonizing cross-workflow provenance. In this demo, we demonstrate SHARP through a real-world omic experiment involving workflow traces generated by the Taverna and Galaxy systems. SHARP starts by interlinking the provenance traces generated by Galaxy and Taverna workflows, then harmonizes the interlinked graphs using OWL and PROV inference rules. The resulting provenance graph can be exploited to answer queries across Galaxy and Taverna workflow runs.
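    A minimal sketch of the interlink-then-query idea, assuming the rdflib library and invented trace files and URIs (SHARP itself is not a Python library; the owl:sameAs link and the naive propagation loop merely stand in for the OWL/PROV inference step):

```python
from rdflib import Graph, URIRef
from rdflib.namespace import OWL

# Load two provenance traces (hypothetical file names).
g = Graph()
g.parse("galaxy_trace.ttl", format="turtle")
g.parse("taverna_trace.ttl", format="turtle")

# Interlink: assert that an entity produced by the Galaxy run is the
# same as an entity consumed by the Taverna run (hypothetical URIs).
galaxy_out = URIRef("http://example.org/galaxy/output42")
taverna_in = URIRef("http://example.org/taverna/input7")
g.add((galaxy_out, OWL.sameAs, taverna_in))

# Naive harmonization: copy statements across sameAs pairs.
# A full OWL reasoner would do this and much more.
pairs = list(g.subject_objects(OWL.sameAs))
for s, o in pairs:
    for p, v in list(g.predicate_objects(o)):
        g.add((s, p, v))

# Query across both workflow runs in the merged graph.
q = """
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT ?activity ?entity WHERE { ?activity prov:used ?entity . }
"""
for row in g.query(q):
    print(row.activity, row.entity)
```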

    KGRAM Versatile Inference and Query Engine for the Web of Linked Data

    Querying and linking distributed and heterogeneous databases is increasingly needed, as plentiful data resources are published over the Web. This work describes the design of a versatile query system named KGRAM that supports (i) multiple query languages, including the SPARQL 1.1 standard, (ii) federation of multiple heterogeneous and distributed data sources, and (iii) adaptability to various data manipulation use cases. KGRAM provides abstractions for both the query language and the data model, thus delivering unifying reasoning mechanisms. It is implemented as a modular software suite to ease architecting and deploying dedicated data manipulation platforms. Its design integrates optimization concerns to deliver high query performance. Both KGRAM's software versatility and its performance are evaluated.
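    KGRAM is a Java engine and its actual API is not shown here; as a rough illustration of federating several SPARQL sources, the sketch below sends one query to a list of placeholder endpoints with the SPARQLWrapper library and pools the bindings:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Placeholder endpoints standing in for distributed data sources.
ENDPOINTS = [
    "http://example.org/sparql/site-a",
    "http://example.org/sparql/site-b",
]

QUERY = """
SELECT ?dataset WHERE { ?dataset a <http://example.org/vocab#Dataset> . }
"""

def federated_select(endpoints, query):
    """Send the query to every endpoint and concatenate the bindings."""
    results = []
    for url in endpoints:
        sw = SPARQLWrapper(url)
        sw.setQuery(query)
        sw.setReturnFormat(JSON)
        response = sw.query().convert()
        results.extend(response["results"]["bindings"])
    return results

for binding in federated_select(ENDPOINTS, QUERY):
    print(binding["dataset"]["value"])
```

    A real federation engine additionally decomposes queries, pushes filters to the sources, and joins partial results; this sketch only shows the fan-out step.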

    From scientific workflow patterns to 5-star linked open data

    Scientific workflow management systems have been widely adopted by data-intensive science communities. Many efforts have been dedicated to the representation and exploitation of provenance to improve reproducibility in data-intensive sciences. However, few works address the mining of provenance graphs to annotate the produced data with domain-specific context for better interpretation and sharing of results. In this paper, we propose PoeM, a lightweight framework for mining provenance in scientific workflows. PoeM produces linked in silico experiment reports based on workflow runs. It leverages semantic web technologies and reference vocabularies (PROV-O, P-Plan) to generate provenance mining rules and finally assemble linked scientific experiment reports (Micropublications, Experimental Factor Ontology). Preliminary experiments demonstrate that PoeM enables the querying and sharing of Galaxy-processed genomic data as 5-star linked datasets.
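    A provenance mining rule of the kind described can be approximated as a SPARQL CONSTRUCT query over PROV-O terms; the sketch below, with an invented trace file, activity label, and ex: vocabulary, tags workflow outputs with a domain-specific type (this is an illustration, not PoeM's rule syntax):

```python
from rdflib import Graph

g = Graph()
g.parse("workflow_run.ttl", format="turtle")  # hypothetical trace file

# A mining rule: every entity generated by an activity whose label
# mentions "bwa" is tagged with a domain type (ex: is invented).
RULE = """
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://example.org/vocab#>
CONSTRUCT { ?out a ex:AlignmentResult . }
WHERE {
    ?out prov:wasGeneratedBy ?act .
    ?act rdfs:label ?label .
    FILTER (CONTAINS(LCASE(STR(?label)), "bwa"))
}
"""
annotations = g.query(RULE).graph  # the constructed triples
g += annotations                   # enrich the report graph
print(annotations.serialize(format="turtle"))
```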

    Composing Multiple Variability Artifacts to Assemble Coherent Workflows

    The development of scientific workflows is evolving towards the systematic use of service-oriented architectures, enabling the composition of dedicated and highly parameterized software services into processing pipelines. Building consistent workflows then becomes a cumbersome and error-prone activity, as users cannot manage such large-scale variability. This paper presents a rigorous, tool-supported approach in which techniques from Software Product Line (SPL) engineering are reused and extended to manage variability in service and workflow descriptions, facilitating composition while ensuring consistency. Services are organized in a rich catalog, structured as an SPL according to the common and variable concerns captured for all services. Sound merging techniques on the feature models that make up the catalog support reasoning about the compatibility between connected services. Moreover, an entire workflow is then seen as a multiple SPL (i.e., a composition of several SPLs). When services are configured within it, the propagation of variability choices is automated with appropriate techniques, and the user is assisted in obtaining a consistent workflow. The proposed approach is fully supported by a combination of dedicated tools and languages. Illustrations and experimental validations are provided using medical imaging pipelines, which are representative of current scientific workflows in many domains.
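    As a toy rendering of the compatibility reasoning, far simpler than real feature-model merging, the following sketch models each service as a set of selected features plus excluded ones, and checks a connected pair for conflicts (all service and feature names are invented):

```python
from dataclasses import dataclass, field

@dataclass
class ServiceModel:
    """A drastically simplified stand-in for a service's feature model."""
    name: str
    features: set                                 # selected features
    excludes: set = field(default_factory=set)    # features it forbids

def compatible(producer: ServiceModel, consumer: ServiceModel) -> bool:
    """Two connected services conflict if one selects a feature
    that the other explicitly excludes."""
    return (producer.features.isdisjoint(consumer.excludes)
            and consumer.features.isdisjoint(producer.excludes))

# Invented medical-imaging services for illustration.
segmenter = ServiceModel("segmenter", {"format:nifti", "modality:mri"})
registrar = ServiceModel("registration", {"modality:mri"},
                         excludes={"format:dicom"})

assert compatible(segmenter, registrar)  # no excluded feature is selected
```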

    Publishing, Sharing and Reusing Rules on the Web of Data

    The goal of the work presented in this article is to foster the reuse of rules on the Web, following the Linked Data principles. Alongside RDF data, RDFS schemas, and OWL ontologies, rules can be published and shared on the Web. Our approach treats rule bases as data sources, represented in RDF, that can be published, shared, and queried on the Web of data, enabling the selection and reuse of the rules that are relevant and useful in a particular context or application. We consider selecting rules according to the annotations that describe them, according to their content, or both. We implemented our approach with the Corese/KGRAM engine, which can process data that is centralized or distributed over the Web of data, and we ran experiments on selecting the rules of the OWL semantics for data based on popular ontologies.
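    The selection mechanism can be pictured with rdflib: a rule base is published as RDF, each rule carrying annotations, and a SPARQL query retrieves only the relevant rules (the ex: vocabulary and rule bodies below are invented):

```python
from rdflib import Graph

# A tiny rule base published as RDF; the ex: vocabulary is invented.
RULEBASE = """
@prefix ex: <http://example.org/rules#> .
ex:rule1 a ex:Rule ;
    ex:targets <http://www.w3.org/2002/07/owl#TransitiveProperty> ;
    ex:body "IF ?p transitive AND ?x ?p ?y AND ?y ?p ?z THEN ?x ?p ?z" .
ex:rule2 a ex:Rule ;
    ex:targets <http://www.w3.org/2002/07/owl#SymmetricProperty> ;
    ex:body "IF ?p symmetric AND ?x ?p ?y THEN ?y ?p ?x" .
"""

g = Graph()
g.parse(data=RULEBASE, format="turtle")

# Select only the rules relevant to an OWL construct actually used by
# the target dataset (hard-coded to transitivity for the example).
q = """
PREFIX ex: <http://example.org/rules#>
SELECT ?body WHERE {
    ?rule ex:targets <http://www.w3.org/2002/07/owl#TransitiveProperty> ;
          ex:body ?body .
}
"""
for row in g.query(q):
    print(row.body)
```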

    Semantic Federation of Distributed Neurodata

    Federation of neurodata repositories is increasingly needed to implement multi-centric studies. Data federation is difficult due to the heterogeneous nature of distributed data repositories and the technical difficulty of addressing multiple data sources simultaneously. This paper describes a data federation system based on a semantic mediation layer and an advanced distributed query engine able to interface with multiple heterogeneous neurodata sources. Both performance and usability indicators are shown, demonstrating the soundness of the approach and its practical feasibility.
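    The mediation layer's job can be caricatured in a few lines: each source answers its part, and the mediator joins the partial results on a shared key (the sources, their contents, and the key below are invented; a real engine plans and optimizes such joins):

```python
# A minimal mediator sketch: each source contributes partial bindings,
# and the mediator joins them on a shared subject identifier.
site_a = {  # e.g. subject demographics at one hospital
    "subj01": {"age": 34},
    "subj02": {"age": 58},
}
site_b = {  # e.g. image acquisitions at another center
    "subj01": {"scan": "T1-weighted MRI"},
    "subj03": {"scan": "fMRI"},
}

def mediated_join(a, b):
    """Join two sources' bindings on the shared subject key."""
    for key in a.keys() & b.keys():
        yield {"subject": key, **a[key], **b[key]}

for row in mediated_join(site_a, site_b):
    print(row)  # only subj01 appears in both sources
```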

    Multi-Source Federation in Neuroscience: Integrating Relational and Semantic Data

    Federating and querying multi-source data is a growing need. In collaborative neuroscience, data repositories are heterogeneous and cannot be relocated away from their original sites, for historical, legal, or ethical reasons. This article presents an information retrieval system that interfaces with multiple, heterogeneous, and distributed data repositories. The system is evaluated, in terms of usability and performance, within a collaborative neuroscience platform dedicated to multi-centric clinical studies.
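    One step of relational-semantic integration the title alludes to is lifting a relational source into triples so it can be queried alongside RDF repositories; here is a hypothetical sketch with sqlite3 and rdflib, using an invented table and vocabulary:

```python
import sqlite3
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/neuro#")

# An invented relational source standing in for a site's database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE exams (id TEXT, modality TEXT)")
db.execute("INSERT INTO exams VALUES ('e1', 'MRI'), ('e2', 'PET')")

# Wrapper step: lift each row into RDF so the source can join the
# federation and be queried with SPARQL like any semantic repository.
g = Graph()
for exam_id, modality in db.execute("SELECT id, modality FROM exams"):
    exam = EX[exam_id]
    g.add((exam, RDF.type, EX.Exam))
    g.add((exam, EX.modality, Literal(modality)))

print(g.serialize(format="turtle"))
```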

    Federating distributed and heterogeneous information sources in neuroimaging: the NeuroBase Project.

    The NeuroBase project aims at studying the requirements for federating, through the Internet, information sources in neuroimaging. These sources are distributed across different experimental sites, hospitals, or research centers in cognitive neurosciences, and contain heterogeneous data and image processing programs. More precisely, the project consists in creating a shared ontology, suitable for supporting various neuroimaging applications, and a computer architecture for accessing and sharing relevant distributed information. We briefly describe the semantic model and report in more detail the architecture we chose, based on a mediator/wrapper approach. To give a flavor of the future deployment of our architecture, we describe a demonstrator that implements the comparison of distributed image processing tools applied to distributed neuroimaging data.
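    The mediator/wrapper split can be sketched as an interface: each wrapper hides one source behind a common query method, and the mediator only talks to that interface (everything below is illustrative, not NeuroBase code):

```python
from abc import ABC, abstractmethod

class SourceWrapper(ABC):
    """Common interface the mediator sees for every federated source."""
    @abstractmethod
    def answer(self, query: str) -> list:
        ...

class HospitalWrapper(SourceWrapper):
    def answer(self, query: str) -> list:
        # Would translate `query` to the hospital's local schema.
        return [{"source": "hospital", "query": query}]

class ResearchCenterWrapper(SourceWrapper):
    def answer(self, query: str) -> list:
        # Would call the research center's image-processing catalog.
        return [{"source": "research-center", "query": query}]

class Mediator:
    def __init__(self, wrappers: list):
        self.wrappers = wrappers

    def ask(self, query: str) -> list:
        """Fan the query out to every wrapper and pool the answers."""
        return [row for w in self.wrappers for row in w.answer(query)]

mediator = Mediator([HospitalWrapper(), ResearchCenterWrapper()])
print(mediator.ask("list T1 MRI exams"))
```

    The value of the pattern is that adding a new site only requires a new wrapper; the mediator and the shared ontology are untouched.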
    • …